Adaptive Matrix Completion for the Users and the Items in Tail
Recommender systems are widely used to recommend the most appealing items to
users. These recommendations can be generated by applying collaborative
filtering methods. The low-rank matrix completion method is the
state-of-the-art collaborative filtering method. In this work, we show that the
skewed distribution of ratings in the user-item rating matrix of real-world
datasets affects the accuracy of matrix-completion-based approaches. Also, we
show that the number of ratings that an item or a user has positively
correlates with the ability of low-rank matrix-completion-based approaches to
predict the ratings for the item or the user accurately. Furthermore, we use
these insights to develop four matrix-completion-based approaches, i.e.,
Frequency Adaptive Rating Prediction (FARP), Truncated Matrix Factorization
(TMF), Truncated Matrix Factorization with Dropout (TMF + Dropout) and Inverse
Frequency Weighted Matrix Factorization (IFWMF), that outperform traditional
matrix-completion-based approaches for the users and the items with few ratings
in the user-item rating matrix.
Comment: 7 pages, 3 figures, ACM WWW'1
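The idea of weighting ratings by inverse frequency can be illustrated with a small sketch. This is not the authors' IFWMF formulation, which the abstract does not specify; the weight formula (inverse square root of the product of user and item rating counts) and the function names `inverse_frequency_weights` and `weighted_squared_error` are assumptions chosen only to show how ratings from tail users and items could count more in a training loss.

```python
from collections import Counter


def inverse_frequency_weights(ratings):
    """Return one weight per (user, item, rating) triple.

    Assumed weighting (hypothetical, not from the paper): a rating gets
    weight 1 / sqrt(n_user * n_item), so users and items with few ratings
    contribute more per rating.
    """
    user_counts = Counter(user for user, _, _ in ratings)
    item_counts = Counter(item for _, item, _ in ratings)
    return [
        1.0 / (user_counts[user] * item_counts[item]) ** 0.5
        for user, item, _ in ratings
    ]


def weighted_squared_error(ratings, predict, weights):
    """Sum of weighted squared errors for a prediction function predict(user, item)."""
    return sum(
        weight * (rating - predict(user, item)) ** 2
        for (user, item, rating), weight in zip(ratings, weights)
    )


ratings = [("a", "x", 5), ("a", "y", 3), ("b", "x", 4)]
weights = inverse_frequency_weights(ratings)
loss = weighted_squared_error(ratings, lambda user, item: 4, weights)
```

In this toy data the rating on the rarely rated item `y` receives a larger weight than the rating where both user and item are frequent, which is the behavior the abstract motivates for tail users and items.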
GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding
Learning continuous representations of nodes has recently attracted growing
interest in both academia and industry, due to their simplicity and
effectiveness in a variety of applications. Most existing node embedding
algorithms and systems are capable of processing networks with hundreds of
thousands or a few million nodes. However, how to scale them to networks
that have tens of millions or even hundreds of millions of nodes remains a
challenging problem. In this paper, we propose GraphVite, a high-performance
CPU-GPU hybrid system for training node embeddings, by co-optimizing the
algorithm and the system. On the CPU end, augmented edge samples are generated
in parallel by random walks in an online fashion on the network, and serve as
the training data. On the GPU end, a novel parallel negative sampling is proposed
to leverage multiple GPUs to train node embeddings simultaneously, without much
data transfer and synchronization. Moreover, an efficient collaboration
strategy is proposed to further reduce the synchronization cost between CPUs
and GPUs. Experiments on multiple real-world networks show that GraphVite is
highly efficient. It takes only about one minute for a network with 1 million
nodes and 5 million edges on a single machine with 4 GPUs, and takes around 20
hours for a network with 66 million nodes and 1.8 billion edges. Compared to
the current fastest system, GraphVite is about 50 times faster without
sacrificing performance.
Comment: accepted at WWW 201
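As a rough illustration of the CPU-side step, the sketch below generates a random walk and turns node pairs that fall within a fixed window on the walk into extra edge samples. It is a simplified, single-threaded sketch and not GraphVite's implementation; the window-based pairing rule, the adjacency-list format, and the names `random_walk` and `augmented_edges` are assumptions made for the example.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Walk up to `length` steps from `start`, stopping early at a dead end."""
    walk = [start]
    for _ in range(length):
        neighbors = adjacency[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk


def augmented_edges(walk, window):
    """Pair each node with the nodes that follow it within `window` steps.

    Assumed augmentation rule (hypothetical): nodes close together on a
    walk are treated as additional positive edge samples.
    """
    return [
        (walk[i], walk[j])
        for i in range(len(walk))
        for j in range(i + 1, min(i + window + 1, len(walk)))
    ]


# A directed path graph, so the walk is deterministic regardless of the seed.
adjacency = {0: [1], 1: [2], 2: [3], 3: []}
walk = random_walk(adjacency, start=0, length=5, rng=random.Random(0))
edges = augmented_edges(walk, window=2)
```

A real system would run many such walks concurrently and stream the resulting samples to the GPUs; this sketch only shows the sampling logic for one walk.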
Towards Neural Mixture Recommender for Long Range Dependent User Sequences
Understanding temporal dynamics has proved to be highly valuable for accurate
recommendation. Sequential recommenders have been successful in modeling the
dynamics of users and items over time. However, while different model
architectures excel at capturing various temporal ranges or dynamics, distinct
application contexts require adapting to diverse behaviors. In this paper we
examine how to build a model that can make use of different temporal ranges and
dynamics depending on the request context. We begin with the analysis of an
anonymized YouTube dataset comprising millions of user sequences. We quantify
the degree of long-range dependence in these sequences and demonstrate that
both short-term and long-term dependent behavioral patterns co-exist. We then
propose a neural Multi-temporal-range Mixture Model (M3) as a tailored solution
to deal with both short-term and long-term dependencies. Our approach employs a
mixture of models, each with a different temporal range. These models are
combined by a learned gating mechanism capable of exerting different model
combinations given different contextual information. In empirical evaluations
on a public dataset and our own anonymized YouTube dataset, M3 consistently
outperforms state-of-the-art sequential recommendation methods.
Comment: Accepted at WWW 201
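The gating idea described above can be sketched in a few lines: several sub-models each produce a score vector, and a softmax over context-dependent gate scores decides how they are mixed. This is a generic mixture-of-experts sketch, not the M3 architecture itself; the function names `softmax` and `mixture_predict` and the toy inputs are assumptions, and in a trained model the gate scores would come from a learned function of the request context.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def mixture_predict(expert_outputs, gate_scores):
    """Combine per-expert score vectors using softmax gate weights.

    expert_outputs: one score vector per expert, all of the same length.
    gate_scores: one unnormalized gate score per expert (assumed to be
    produced elsewhere from contextual features).
    """
    weights = softmax(gate_scores)
    dimension = len(expert_outputs[0])
    return [
        sum(weight * output[k] for weight, output in zip(weights, expert_outputs))
        for k in range(dimension)
    ]


# Two hypothetical experts, e.g. a short-range and a long-range model.
short_range = [1.0, 0.0]
long_range = [0.0, 1.0]
balanced = mixture_predict([short_range, long_range], gate_scores=[0.0, 0.0])
short_biased = mixture_predict([short_range, long_range], gate_scores=[100.0, 0.0])
```

With equal gate scores the experts are averaged; with a strongly skewed gate the output is dominated by one expert, which is how different contexts can select different temporal ranges.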
Dynamic Deep Multi-modal Fusion for Image Privacy Prediction
With millions of images shared online on social networking sites,
effective methods for image privacy prediction are much needed. In this
paper, we propose an approach for fusing object, scene context, and image tags
modalities derived from convolutional neural networks for accurately predicting
the privacy of images shared online. Specifically, our approach identifies the
set of most competent modalities on the fly, according to each new target image
whose privacy has to be predicted. The approach considers three stages to
predict the privacy of a target image, wherein we first identify the
neighborhood images that are visually similar and/or have similar sensitive
content as the target image. Then, we estimate the competence of the modalities
based on the neighborhood images. Finally, we fuse the decisions of the most
competent modalities and predict the privacy label for the target image.
Experimental results show that our approach predicts the sensitive (or private)
content more accurately than the models trained on individual modalities
(object, scene, and tags) and prior privacy prediction works. Also, our
approach outperforms strong baselines that train meta-classifiers to obtain an
optimal combination of modalities.
Comment: Accepted by The Web Conference (WWW) 201
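The three stages described above (find neighbors, estimate each modality's competence on them, fuse the most competent modalities) can be sketched as follows. This is a simplified sketch and not the authors' method: competence is assumed to be plain accuracy on the neighborhood images, fusion is assumed to be a majority vote, and the names `competence`, `fuse`, and the `threshold` parameter are hypothetical. Neighbor retrieval is taken as already done.

```python
from collections import Counter


def competence(neighbor_predictions, neighbor_labels):
    """Fraction of neighborhood images a modality labels correctly (assumed measure)."""
    correct = sum(
        predicted == actual
        for predicted, actual in zip(neighbor_predictions, neighbor_labels)
    )
    return correct / len(neighbor_labels)


def fuse(target_predictions, neighbor_predictions, neighbor_labels, threshold=0.5):
    """Majority vote over modalities whose neighborhood competence meets the threshold.

    Falls back to all modalities if none is competent enough.
    """
    competent = [
        modality
        for modality in target_predictions
        if competence(neighbor_predictions[modality], neighbor_labels) >= threshold
    ]
    if not competent:
        competent = list(target_predictions)
    votes = Counter(target_predictions[modality] for modality in competent)
    return votes.most_common(1)[0][0]


neighbor_labels = ["private", "private", "public"]
neighbor_predictions = {
    "object": ["private", "private", "public"],
    "scene": ["public", "public", "public"],
    "tags": ["private", "public", "public"],
}
target_predictions = {"object": "private", "scene": "public", "tags": "public"}
label = fuse(target_predictions, neighbor_predictions, neighbor_labels, threshold=0.9)
```

In this toy case only the object modality is competent on the neighborhood, so its decision is used even though a plain vote over all three modalities would pick the other label.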
Detecting Areas of Potential High Prevalence of Chagas in Argentina
A map of potential prevalence of Chagas disease (ChD) with high spatial
disaggregation is presented. It aims to detect areas outside the Gran Chaco
ecoregion (hyperendemic for ChD) that are characterized by high affinity with
ChD and high health vulnerability.
To quantify potential prevalence, we developed several indicators. The first
is an Affinity Index, which quantifies the degree of linkage between endemic
areas of ChD and the rest of the country. We also studied favorable habitability
conditions for Triatoma infestans, looking for areas where the predominant
materials of floors, roofs and internal ceilings favor the presence of the
disease vector.
We studied determinants of a more general nature that can be encompassed
under the concept of a Health Vulnerability Index. These determinants are
associated with access to health providers and the socio-economic level of
different segments of the population.
Finally, we constructed a Chagas Potential Prevalence Index (ChPPI) which
combines the affinity index, the health vulnerability index, and the population
density. We show and discuss the maps obtained. These maps are intended to
assist public health specialists, decision makers of public health policies and
public officials in the development of cost-effective strategies to improve
access to diagnosis and treatment of ChD.
Comment: Proceedings of the 2019 World Wide Web Conference. May 13-17, 2019.
San Francisco, CA, US